IN THE CLAIMS 

1. (Currently amended) A method for monitoring abnormalities in a data stream, comprising 
the steps of: 

receiving a plurality of objects in the data stream; 

creating one or more clusters from the plurality of objects, wherein at least a portion of each 
of the one or more clusters comprises statistical data representative of the respective cluster, wherein 
the statistical data comprises a time-sensitive weight for each of the plurality of objects in each of the 
one or more clusters, the time-sensitive weight having a value that decreases at a specified rate such 
that more recently received objects are assigned a higher priority, and wherein the one or more 
clusters are condensed for maintenance at a high level of granularity as one or more cluster droplets; 

determining from the statistical data whether each of the one or more clusters is abnormal 
when compared to , wherein a cluster is abnormal when no objects in the data stream are added to the 
cluster prior to the time-sensitive weights of the cluster decreasing to a predefined value; and 

reporting at least one of the one or more clusters as an abnormal cluster of objects in the data 
stream. 
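
Editorial illustration (not part of the claims): the time-sensitive weight and the abnormality test recited in claim 1 might be sketched as follows. The exponential decay form `2 ** (-rate * age)`, the unit weight given to each new object, and the names `Droplet`, `rate`, and `floor` are assumptions made for this sketch; the claim itself only requires a weight that decreases at a specified rate and a predefined value.

```python
from dataclasses import dataclass


@dataclass
class Droplet:
    """Condensed summary of one cluster (hypothetical structure)."""
    weight: float = 0.0       # sum of decayed weights of member objects
    last_update: float = 0.0  # time at which `weight` was last brought current

    def decay_to(self, now: float, rate: float) -> None:
        # Assumed decay law: every weight halves each 1/rate time units.
        self.weight *= 2.0 ** (-rate * (now - self.last_update))
        self.last_update = now

    def add(self, now: float, rate: float) -> None:
        # Age the existing weight, then count the new object at full weight.
        self.decay_to(now, rate)
        self.weight += 1.0


def is_abnormal(d: Droplet, now: float, rate: float, floor: float) -> bool:
    """True if the cluster weight has decayed to `floor` with no new additions."""
    current = d.weight * 2.0 ** (-rate * (now - d.last_update))
    return current <= floor
```

Under these assumptions, a cluster that keeps receiving objects has its weight replenished on every `add`, while a cluster that receives nothing eventually falls to the floor and is reported.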

2. (Original) The method of claim 1, wherein the step of creating one or more clusters further 
comprises: 

computing one or more similarity values for a given object relating to one or more existing 
clusters; and 

determining a closest cluster for the object based on the one or more similarity values. 

3. (Original) The method of claim 2, further comprising the steps of: 
determining whether to add the object to the closest cluster; 

adding the object to the closest cluster when determined and updating the statistical data of 
the closest cluster; and 

creating a new cluster comprising the object when the object is not added to the closest 
cluster, and generating statistical data of the new cluster. 

4. (Original) The method of claim 3, wherein the step of determining whether to add the 
object to the closest cluster further comprises the step of determining if the similarity value is greater 
than a user-defined threshold. 
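
Editorial illustration (not part of the claims): the assignment steps of claims 2 through 4 might be sketched as follows. The similarity measure used here (the average relative frequency of the object's attribute values within the cluster) is an assumption chosen for brevity, as are the names `Cluster`, `assign`, and `threshold`; the claims do not prescribe a particular similarity function.

```python
from collections import Counter


class Cluster:
    """Cluster summarized by per-attribute value counts (hypothetical)."""

    def __init__(self, n_attrs):
        self.size = 0
        self.counts = [Counter() for _ in range(n_attrs)]

    def add(self, obj):
        # Incrementally update the statistical data of the cluster.
        self.size += 1
        for counter, value in zip(self.counts, obj):
            counter[value] += 1

    def similarity(self, obj):
        # Assumed measure: mean fraction of members sharing each attribute value.
        if self.size == 0:
            return 0.0
        return sum(c[v] / self.size for c, v in zip(self.counts, obj)) / len(obj)


def assign(obj, clusters, threshold):
    """Add obj to the closest cluster if similar enough, else start a new one."""
    best = max(clusters, key=lambda c: c.similarity(obj), default=None)
    if best is not None and best.similarity(obj) > threshold:
        best.add(obj)
        return best
    new = Cluster(len(obj))
    new.add(obj)
    clusters.append(new)
    return new
```

An object is represented here as a tuple of categorical attribute values, for example a source address, a destination address, and a signature.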

5. (Previously presented) The method of claim 1, wherein the step of determining from the 
statistical data whether each of the one or more clusters is abnormal comprises the steps of: 

determining which clusters present at a first time were not present at a second time, wherein 
the second time is before the first time; 

determining which of the clusters, present at the first time and not present at the second time, 
contain fewer than a user-defined number of objects; and 

reporting clusters with fewer than the user-defined number of objects as abnormalities. 
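
Editorial illustration (not part of the claims): the comparison recited in claim 5 might be sketched as follows, assuming each stored snapshot is a mapping from a cluster identifier to its object count. The function name, the identifiers, and the parameter `min_objects` are hypothetical.

```python
def new_small_clusters(earlier, later, min_objects):
    """Return ids of clusters present in `later` but absent from `earlier`
    that contain fewer than `min_objects` objects.

    `earlier` is the snapshot at the second (older) time and `later` is the
    snapshot at the first (more recent) time.
    """
    return sorted(
        cid
        for cid, n_objects in later.items()
        if cid not in earlier and n_objects < min_objects
    )
```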

6. (Original) The method of claim 1, wherein the statistical data of each cluster is stored 
using an incremental updating process. 

7. (Original) The method of claim 1, wherein the statistical data of each cluster comprises 
one or more statistical counts of each pairwise attribute. 

8. (Original) The method of claim 1, wherein the statistical data of each cluster comprises 
one or more statistical counts of each categorical attribute. 

9. (Original) The method of claim 1, wherein the statistical data of each cluster comprises a 
number of objects in each cluster. 

10. (Original) The method of claim 1, wherein the statistical data is stored periodically at 
intervals chosen based on a pyramidal distribution. 
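
Editorial illustration (not part of the claims): one possible reading of storage at pyramidally distributed intervals is sketched below. It assumes a scheme in which a snapshot taken at clock tick `t` has order `i` when `base ** i` is the largest power of `base` dividing `t`, and only the most recent `per_order` snapshots of each order are retained. This scheme, and the names `snapshot_order`, `retained_snapshots`, `base`, and `per_order`, are assumptions; the claim does not specify them.

```python
def snapshot_order(tick, base=2):
    """Largest i such that base**i divides tick (tick must be positive)."""
    if tick <= 0:
        raise ValueError("tick must be positive")
    order = 0
    while tick % base == 0:
        tick //= base
        order += 1
    return order


def retained_snapshots(now, base=2, per_order=2):
    """Ticks whose snapshots are still kept at time `now`.

    Walks backward from `now`, keeping the newest `per_order` ticks of each
    order, so recent history is dense and older history is sparse.
    """
    kept = {}
    for tick in range(now, 0, -1):
        bucket = kept.setdefault(snapshot_order(tick, base), [])
        if len(bucket) < per_order:
            bucket.append(tick)
    return sorted(t for ticks in kept.values() for t in ticks)
```

Under these assumptions, the number of retained snapshots grows only logarithmically with elapsed time, while recent ticks remain densely covered.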




Attorney Docket No. YOR920040039US1 

11. (Original) The method of claim 1, wherein the step of creating one or more clusters 
further comprises the step of applying one or more weights to one or more attributes. 

12. (Original) The method of claim 1, wherein abnormalities comprise intrusions in a 
network. 

13. (Original) The method of claim 12, wherein the step of receiving a plurality of objects 
further comprises the step of collecting source IP (Internet Protocol) address data, destination IP 
address data and signature data. 

14. (Original) The method of claim 12, wherein the step of creating one or more clusters 
further comprises the step of clustering source IP address data, destination IP address data and 
signature data. 

15. (Previously presented) The method of claim 12, wherein the step of determining from the 
statistical data whether each of the one or more clusters is abnormal comprises the step of detecting 
one or more intrusions from statistical data of source IP address data, destination IP address data and 
signature data. 

16. (Currently amended) Apparatus for monitoring abnormalities in a data stream, 
comprising: 

a memory; and 

at least one processor coupled to the memory and operative to: (i) receive a plurality of 
objects in the data stream; (ii) create one or more clusters from the plurality of objects, wherein at 
least a portion of each of the one or more clusters comprises statistical data representative of the 
respective cluster, wherein the statistical data comprises a time-sensitive weight for each of the 
plurality of objects in each of the one or more clusters, the time-sensitive weight having a value that 
decreases at a specified rate such that more recently received objects are assigned a higher priority, 
and wherein the one or more clusters are condensed for maintenance at a high level of granularity as 
one or more cluster droplets; and (iii) determine from the statistical data whether each of the one or 
more clusters is abnormal when compared to , wherein a cluster is abnormal when no objects in the 
data stream are added to the cluster prior to the time-sensitive weights of the cluster decreasing to a 
predefined value. 

17. (Original) The apparatus of claim 16, wherein the operation of creating one or more 
clusters further comprises: 

computing one or more similarity values for a given object relating to one or more existing 
clusters; and 

determining a closest cluster for the object based on the one or more similarity values. 

18. (Original) The apparatus of claim 17, further comprising: 
determining whether to add the object to the closest cluster; 

adding the object to the closest cluster when determined and updating the statistical data of 
the closest cluster; and 

creating a new cluster comprising the object when the object is not added to the closest 
cluster, and generating statistical data of the new cluster. 

19. (Original) The apparatus of claim 18, wherein determining whether to add the object to 
the closest cluster further comprises determining if the similarity value is greater than a user defined 
threshold. 

20. (Previously presented) The apparatus of claim 17, wherein the operation of determining 
from the statistical data whether each of the one or more clusters is abnormal further comprises: 

determining which clusters present at a first time were not present at a second time, wherein 
the second time is before the first time; 

determining which of the clusters, present at the first time and not present at the second time, 
contain fewer than a user defined number of objects; and 

reporting clusters with fewer than a defined number of objects as abnormalities. 

21. (Original) The apparatus of claim 16, wherein the statistical data of each cluster is stored 
using an incremental updating process. 

22. (Original) The apparatus of claim 16, wherein the statistical data of each cluster 
comprises one or more statistical counts of each pairwise attribute. 

23. (Original) The apparatus of claim 16, wherein the statistical data of each cluster 
comprises one or more statistical counts of each categorical attribute. 

24. (Original) The apparatus of claim 16, wherein the statistical data of each cluster 
comprises a number of objects in each cluster. 

25. (Original) The apparatus of claim 16, wherein the statistical data is stored periodically at 
intervals chosen based on a pyramidal distribution. 

26. (Original) The apparatus of claim 16, wherein the operation of creating one or more 
clusters further comprises applying one or more weights to one or more attributes. 

27. (Original) The apparatus of claim 16, wherein abnormalities comprise intrusions in a 
network. 

28. (Original) The apparatus of claim 27, wherein the operation of receiving a plurality of 
objects further comprises collecting source IP address data, destination IP address data and signature 
data. 

29. (Original) The apparatus of claim 27, wherein the operation of creating one or more 
clusters further comprises clustering source IP address data, destination IP address data and signature 
data. 

30. (Previously presented) The apparatus of claim 27, wherein the operation of determining 
from the statistical data whether each of the one or more clusters is abnormal further comprises 
detecting one or more intrusions from statistical data of source IP address data, destination IP address 
data, and signature data. 

31. (Currently amended) An article of manufacture for monitoring abnormalities in a data 
stream, comprising a machine readable medium containing one or more programs which when 
executed implement the steps of: 

receiving a plurality of objects in the data stream; 

creating one or more clusters from the plurality of objects, wherein at least a portion of each 
of the one or more clusters comprises statistical data representative of the respective cluster, wherein 
the statistical data comprises a time-sensitive weight for each of the plurality of objects in each of the 
one or more clusters, the time-sensitive weight having a value that decreases at a specified rate such 
that more recently received objects are assigned a higher priority, and wherein the one or more 
clusters are condensed for maintenance at a high level of granularity as one or more cluster droplets; 

determining from the statistical data whether each of the one or more clusters is abnormal 
when compared to , wherein a cluster is abnormal when no objects in the data stream are added to the 
cluster prior to the time-sensitive weights of the cluster decreasing to a predefined value; and 

reporting at least one of the one or more clusters as an abnormal cluster of objects in the data 
stream. 